Flag Value Renaming
FIELD 

[0001] An embodiment of the invention relates to computer operation in 
general, and more specifically to flag value renaming. 

BACKGROUND 

[0002] Most computer architectures contain some type of flag register that
contains a set of switches to control the operation of the machine. For example, an
interrupt flag bit (IF bit) in a machine register may control whether or not interrupts are
enabled in the machine.

[0003] A register renamer (referred to as a "renamer" herein) may rename
logical registers onto a processor's physical register file. The renaming process may
allow a smaller, architecturally defined register file to be dynamically expanded to use a
larger number of physical registers available in a processor. Renaming may be utilized to
eliminate conflicts caused by multiple instructions creating simultaneous but unique
versions of a register. A processor pipeline may include many different instances of a
register at one time.

[0004] However, complications may arise in the naming of certain flags. In 
certain instances, a flag may be set or cleared not only from an instruction, but also from 
the data path of a machine. For example, an IF bit may be set according to data loaded 
from memory, while a clear interrupt flag (CLI) instruction clears the IF flag. In 
conventional systems, a flag may therefore be non-renamed, thereby requiring 
serialization and delay to order any writes and reads. In the alternative, a flag may be
fully renamed, which may require excessive hardware. 

BRIEF DESCRIPTION OF THE DRAWINGS

[0005] The invention may be best understood by referring to the following
description and accompanying drawings that are used to illustrate embodiments of the
invention. In the drawings:

[0006] Figure 1 illustrates an embodiment of a register receiving flag values
from multiple sources;

[0007] Figure 2 illustrates an embodiment of a renaming architecture; 
[0008] Figure 3 is a flow chart for an embodiment of the invention; 
[0009] Figure 4 illustrates an embodiment of a processor; and 
[0010] Figure 5 illustrates an embodiment of a computer environment. 



Docket No: 42P 17033 

Express Mail No: EV 331619446 US 






DETAILED DESCRIPTION 
[0011] A method and apparatus are described for flag value renaming. 
[0012] Under an embodiment of the invention, renaming of a flag register
occurs without stalling all succeeding instructions to determine when a flag value
changes. According to the embodiment, stalling or delay of instructions is
limited to instances in which the value of a flag is not known. If an instruction sets or
clears a flag bit, then succeeding instructions utilizing the flag can proceed because the
value of the flag bit is known. Delay may occur when the flag bit is set from a value
from memory because the value of the flag bit is not known until the value is stored.

[0013] Flag value renaming is a mechanism that allows for the tracking of flag 
values. There may be multiple sources of flag values. Flag values may be set or cleared 
by either an instruction that writes a value directly to the register, a direct write, or by 
data that is obtained from the data path of the machine, an indirect write. Under an 
embodiment of the invention, the setting of register values is accomplished by effectively 
executing the direct set or clear instructions at rename time. The register values for 
instructions that update the flags from the data path are updated at retirement. In a 
particular embodiment, in order to avoid hazards connected with inconsistent register 
values, scoreboarding may be used to serialize flag reads and writes. 

[0014] In one possible example, control flags to be set may include the IF 
(interrupt flag) register and the DF (direction flag, indicating whether values are 
incremented or decremented) register bits of the eflags register for the IA-32 
micro-architecture. The flags may be updated from two different types of instructions. 
Direct update instructions can directly set or clear the appropriate flag. For example,
direct update instructions may include: 

STI - set interrupt flag; 

CLI - clear interrupt flag; 

STD - set direction flag; and 

CLD - clear direction flag. 
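
The four direct update instructions listed above can be summarized as a small decode table. The following is a minimal sketch only; the names `DIRECT_UPDATES` and `decode_direct_update` are hypothetical and are introduced here for illustration.

```python
# Hypothetical decode table: mnemonic -> (flag name, value written).
DIRECT_UPDATES = {
    "STI": ("IF", 1),  # set interrupt flag
    "CLI": ("IF", 0),  # clear interrupt flag
    "STD": ("DF", 1),  # set direction flag
    "CLD": ("DF", 0),  # clear direction flag
}


def decode_direct_update(mnemonic):
    """Return (flag, value) for a direct update instruction, else None."""
    return DIRECT_UPDATES.get(mnemonic.upper())
```

Because the value written is fully determined by the mnemonic, the flag value is known as soon as such an instruction is decoded.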

[0015] In contrast, an indirect update instruction reads data from the system
datapath and updates a flag value based upon that data. For example, "popf" may obtain
(or pop) a value from a memory stack and provide such value to the eflags register, and
thus a flag may be updated from the obtained data.
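
As an illustration of why the flag value is not known in advance for an indirect update, the sketch below extracts IF and DF from a popped eflags image. The bit positions (IF at bit 9, DF at bit 10) come from the IA-32 definition of the eflags register rather than from this description, and the function name is hypothetical.

```python
IF_BIT = 9   # interrupt flag position in eflags (IA-32 definition)
DF_BIT = 10  # direction flag position in eflags (IA-32 definition)


def flags_from_popped_value(eflags_image):
    """Extract the IF and DF bits from a value popped off the stack."""
    return {
        "IF": (eflags_image >> IF_BIT) & 1,
        "DF": (eflags_image >> DF_BIT) & 1,
    }
```

The result depends entirely on the data read from memory, so it cannot be determined when the instruction is renamed.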

[0016] Under an embodiment of the invention, a register scoreboard is used to
maintain the operation of registers. The register scoreboard may be utilized to maintain
register coherency by preventing parallel execution units from accessing a register if an
outstanding operation is currently utilizing the register. When an instruction that uses a
particular register is executed, the processor may set a scoreboard bit to indicate that the
register is being used in an operation. If a succeeding instruction requires the use of the
register while the register is in use, as indicated by the scoreboard, then the instruction
may be delayed until completion of the prior instruction. If a succeeding instruction does
not require data from a register that is in use, the processor may execute the instruction
before the prior instruction has completed execution. If an instruction is stalled, later
instructions may be issued and executed if the later instructions do not depend on any
active or stalled instruction.
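
The scoreboard behavior described above may be modeled, under simplifying assumptions, as one busy bit per register. The class and function names below are hypothetical; actual hardware would implement this as state bits and combinational logic.

```python
class Scoreboard:
    """One busy bit per register name (illustrative model only)."""

    def __init__(self):
        self._busy = set()

    def is_busy(self, reg):
        return reg in self._busy

    def set(self, reg):
        self._busy.add(reg)

    def clear(self, reg):
        self._busy.discard(reg)


def can_issue(scoreboard, regs_needed):
    """An instruction may proceed only if none of its registers is in use."""
    return not any(scoreboard.is_busy(r) for r in regs_needed)
```

An instruction that needs only registers whose bits are clear may proceed even while an earlier instruction is still holding a different register.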
[0017] According to an embodiment of the invention, direct update instructions
are effectively "executed" at the renamer by storing the correct data value in the renamer.
In one example, an STI instruction to set the interrupt flag would store a value of "1"
(enable) in the IF bit in the renamer. Any instruction that needs to access the value of IF
would read the value from the renamer at rename time. A direct update instruction that is
addressing a register will check the scoreboard to determine if the register is in use. If the
scoreboard bit for the register is set, the instruction stalls until the scoreboard bit is
cleared.
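
A minimal sketch of a direct update at rename time follows. It assumes the renamer's flag values are held in a dictionary and the scoreboard bits in a set; both representations and the function name are hypothetical.

```python
def rename_direct_update(flags, busy, flag, value):
    """Try to 'execute' a direct update at rename time.

    flags: dict of flag name -> value held in the renamer (hypothetical).
    busy:  set of flag names whose scoreboard bit is set (hypothetical).
    Returns True if the write happened, False if the instruction must stall.
    """
    if flag in busy:
        return False      # scoreboard bit set: stall and retry later
    flags[flag] = value   # value is now visible to later renames
    return True
```

In this sketch an STI would correspond to `rename_direct_update(flags, busy, "IF", 1)`, with no work sent to an execution unit.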

[0018] Indirect update instructions set a scoreboard bit in the renamer and are
processed through the machine normally. For example, if a popf instruction writes to IF,
the data is stored in the ROB (reorder buffer). When the popf retires, this value is
written into the renamer and is available for future instructions that need to read the IF
flag. Indirect update instructions also check the serializing scoreboard. In addition, these
instructions set the serializing scoreboard at rename time and clear this scoreboard when
updating the IF value in the renamer at retirement. The scoreboard algorithm can prevent
RAW (read after write) and WAW (write after write) stall hazards.
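
The life cycle of an indirect update (rename, execute, retire) might be sketched as follows. This is an illustrative model under stated assumptions: the class `FlagRenamer`, its method names, and the use of a dictionary keyed by a ROB entry id are all hypothetical.

```python
class FlagRenamer:
    def __init__(self):
        self.flags = {"IF": 0, "DF": 0}  # values held in the renamer
        self.busy = set()                # serializing scoreboard bits
        self.rob = {}                    # ROB entry id -> (flag, value)

    def rename_indirect(self, entry_id, flag):
        """At rename: stall if busy, otherwise claim the scoreboard bit."""
        if flag in self.busy:
            return False
        self.busy.add(flag)
        self.rob[entry_id] = (flag, None)  # value not known yet
        return True

    def execute_indirect(self, entry_id, value):
        """Data arrives from the datapath and is parked in the ROB."""
        flag, _ = self.rob[entry_id]
        self.rob[entry_id] = (flag, value)

    def retire_indirect(self, entry_id):
        """At retirement: copy the value to the renamer, clear the bit."""
        flag, value = self.rob.pop(entry_id)
        self.flags[flag] = value
        self.busy.discard(flag)

    def read_flag(self, flag):
        """A reader sees the value only when no indirect write is pending."""
        return None if flag in self.busy else self.flags[flag]
```

While the scoreboard bit is set, a later reader or writer of the same flag is held off, which is how the read-after-write and write-after-write orderings are preserved in this model.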

[0019] Under an embodiment of the invention, recovery is provided from
incorrect speculation such as branch mis-prediction. According to one embodiment, the
recovery is provided by shadow logic. A process of flag value renaming has two different
modes, comprising writes from direct instructions and writes from indirect instructions.
In order to handle the two different modes, a valid bit may be added to the shadow logic
to indicate the validity of data. The valid bit enables shadowing for direct instructions and
disables it for indirect instructions. Shadowing is disabled for indirect instructions
because such instructions do not update the values in the renamer until retirement, and
thus the values in the renamer should not be utilized.
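
One possible reading of the shadow logic and its valid bit is sketched below. The description does not specify when a shadow copy is taken or how it is restored, so the checkpoint-and-restore scheme, the function names, and the `pending_indirect` set are assumptions made for illustration.

```python
def take_checkpoint(flags, pending_indirect):
    """Snapshot each flag together with a valid bit.

    The valid bit is cleared for a flag that has an indirect update in
    flight, because the renamer does not yet hold that flag's new value.
    """
    return {name: (value, name not in pending_indirect)
            for name, value in flags.items()}


def recover(flags, checkpoint):
    """After a mis-prediction, restore only entries whose valid bit is set."""
    for name, (value, valid) in checkpoint.items():
        if valid:
            flags[name] = value
    return flags
```

In this sketch a flag written only by direct update instructions is rolled back from its shadow copy, while a flag awaiting an indirect update is left for the retirement path to resolve.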

[0020] An embodiment of the invention may reduce serialization penalties that 
occur if flags are not renamed. The embodiment may operate with relatively minimal 
hardware, such as data flops and decode logic in the renamer, additional bits in the 
shadow logic array, and additional data bits and control logic in the ROB. The 
embodiment may require less hardware than if flags are fully renamed, which may 
require components such as specific rename registers and register pointers. 

[0021] An embodiment of the invention may execute direct update instructions 
at the renamer, and thus it is not necessary to send the instructions to the ALU (arithmetic 
logic unit) of the processor, thereby improving system performance. In comparison with 
full renaming, an embodiment may provide better speed of operation because of 
reduction in the number of instructions that are executed in the ALU. 

[0022] Figure 1 illustrates an embodiment of a set of registers 105 including a
flag 110. In this illustration, an instruction pipeline 115 includes parallel execution of
instructions. The instructions include a first instruction (I-1) 120, a second instruction
(I-2) 125, and a third instruction (I-3) 130. In the example, each of the instructions is
seeking to write to the flag 110. The instructions may include direct update instructions
and indirect update instructions. Under an embodiment of the invention, succeeding
instructions are only stalled when the value of the flag is not known. In this example, if a
prior instruction is a direct update instruction, the value of the flag is known and
succeeding instructions are not stalled. If a prior instruction is an indirect update
instruction, the value of the flag may not be known and succeeding instructions may be
stalled until completion of the prior instruction.

[0023] For example, if the first instruction 120 is a direct update instruction,
then the second instruction 125 is not stalled. However, if second instruction 125 is an
indirect update instruction and thus the value of flag 110 is not known, then the third
instruction 130 may be stalled until the completion of the second instruction 125.
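
The three-instruction example can be expressed as a small function. This is a simplified sketch that considers only the immediately preceding instruction writing the same flag; the function name and the string labels are hypothetical.

```python
def stall_pattern(kinds):
    """Given update kinds in program order ('direct' or 'indirect'),
    report for each instruction whether it may have to wait because the
    instruction just before it leaves the flag value unknown."""
    stalls = []
    previous = None
    for kind in kinds:
        stalls.append(previous == "indirect")
        previous = kind
    return stalls
```

For a direct first instruction, an indirect second instruction, and any third instruction, only the third may be stalled.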

[0024] Figure 2 illustrates an embodiment of a renaming architecture. Figure
2 illustrates the operation of the embodiment, and is not intended to illustrate physical
structure. In the illustrated embodiment, a renamer 205 is utilized to rename registers,
including a first flag 210 and a second flag 215. The flags may include, but are not
limited to, an interrupt flag (IF) and a direction flag (DF). The flags may be set by varied
instructions, including direct update instructions 220 and indirect update instructions 225.
A multiplexer 235 is shown to illustrate the choice between the different types of
instructions.

[0025] To write to one of the flags, a direct update instruction 220 will check a
scoreboard 255 to determine whether the flag is in use. If the scoreboard 255 indicates
that the flag is in use, the instruction will stall. If the scoreboard 255 indicates that the
flag is not in use, the direct update instruction 220 writes the value for the flag to the
renamer 205.

[0026] In the embodiment shown in Figure 2, an indirect update instruction 225
will store the data value in a re-order buffer 230. The indirect update instruction 225 will
also check the scoreboard 255 to determine whether the flag is available. If the flag is
available, the indirect update instruction 225 will set the scoreboard 255 to prevent access
by any other instruction. At retirement, the data value for the flag provided by the
indirect update instruction 225 is stored to the renamer 205 for the flag. The scoreboard
255 then is cleared to allow access to the flag by other instructions.

[0027] In addition, shadow logic 245 may store values of the flags 210 and 215 
to record prior values of the flags. However, an indirect update instruction 225 does not 
update values until retirement and thus should not be shadowed. A valid bit 250 is 
included in the shadow logic 245. The valid bit 250 is enabled for direct update 
instructions 220 and is disabled for indirect update instructions 225. 

[0028] Figure 3 is a flowchart of an embodiment of the invention. In this
illustration, an instruction is received. If the instruction is a direct update instruction 310,
the instruction checks a register scoreboard 315. If the scoreboard bit for the register is
set 320, indicating that another instruction is utilizing the register, there is a delay and the
instruction continues to check the scoreboard 315. When the scoreboard is no longer set
320, the instruction sets the data value in the renamer 325. The shadowing of the register
value is then enabled 328 by setting a bit in the shadow logic.

[0029] If the instruction is not a direct update instruction 310, and thus is an
indirect update instruction, the data value for the register is stored in a re-order buffer
330. When the instruction is being retired 335, the instruction checks to determine
whether the scoreboard bit for the flag is set 340. If the scoreboard is set 345, the
instruction delays and continues to check the scoreboard 340. When the scoreboard bit for
the flag is no longer set 345, the instruction sets the scoreboard bit 350 to prevent access
to the register before the value is provided to the renamer. When the instruction is retired
355, the data value is stored in the renamer 360. When the data value has been stored, the
scoreboard bit for the register is cleared 365 to allow access to the register. The shadowing
of the register value is then disabled 370. The process continues with succeeding
instructions. Multiple instructions in a pipeline may be processed simultaneously in the
manner shown in Figure 3.
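
The two paths through the flowchart can be traced by listing the reference numerals in the order they are visited. The sketch below is illustrative only; the function name and the `busy_cycles` parameter (the number of times the scoreboard is found set before it clears) are hypothetical.

```python
def flow_steps(is_direct, busy_cycles=0):
    """Return the Figure 3 reference numerals in the order visited."""
    steps = [310]                                # direct update instruction?
    if is_direct:
        steps += [315, 320] * (busy_cycles + 1)  # poll scoreboard until clear
        steps += [325, 328]                      # write renamer, enable shadow
    else:
        steps += [330, 335]                      # value to ROB, retiring
        steps += [340, 345] * (busy_cycles + 1)  # poll scoreboard until clear
        steps += [350, 355, 360, 365, 370]       # set bit, retire, write,
                                                 # clear bit, disable shadow
    return steps
```

For example, a direct update that finds the scoreboard clear visits 310, 315, 320, 325, and 328, while each busy poll repeats the check-and-test pair for the path taken.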

[0030] Figure 4 illustrates an embodiment of a processor. The illustration is a 
simplified drawing and does not include all elements of the processor. In this illustration, 
the microprocessor 405 includes a front end section 410, execution logic 415, an 
execution unit 425, and memory 430. The execution logic 415 includes a renamer 420. 
The memory 430 may include one or more cache memories. The processor 405 
processes various instructions, including direct update instructions and indirect update 
instructions. The renamer 420 is used to handle the different types of instructions. In this
embodiment, a direct update instruction that writes to a flag for the processor 405 does
not cause stalling of a succeeding instruction that addresses the same flag. However, an
indirect update instruction that writes to the flag may cause stalling of a succeeding
instruction that writes to the same flag because the value of the flag is not known until the
indirect update instruction is retired.

[0031] Techniques described here may be used in many different
environments. Figure 5 is a block diagram of an embodiment of an exemplary computer.
Under an embodiment of the invention, a computer 500 comprises a bus 505 or other
communication means for communicating information, and a processing means such as
one or more physical processors 510 (shown as 511, 512 and continuing through 513)
coupled with the bus 505 for processing information. Each of the physical
processors may include multiple logical processors, and the logical processors may
operate in parallel. According to an embodiment of the invention, a processor includes a
renamer to rename registers.

[0032] The computer 500 further comprises a random access memory (RAM) or
other dynamic storage device as a main memory 515 for storing information and
instructions to be executed by the processors 510. Main memory 515 also may be used
for storing temporary variables or other intermediate information during execution of
instructions by the processors 510. The computer 500 also may comprise a read only
memory (ROM) 520 and/or other static storage device for storing static information and
instructions for the processor 510.

[0033] A data storage device 525 may also be coupled to the bus 505 of the
computer 500 for storing information and instructions. The data storage device 525 may
include a magnetic disk or optical disc and its corresponding drive, flash memory or other
nonvolatile memory, or other memory device. Such elements may be combined together
or may be separate components, and utilize parts of other elements of the computer 500.

[0034] The computer 500 may also be coupled via the bus 505 to a display
device 530, such as a liquid crystal display (LCD) or other display technology, for
displaying information to an end user. In some environments, the display device may be
a touch-screen that is also utilized as at least a part of an input device. In some
environments, display device 530 may be or may include an auditory device, such as a
speaker for providing auditory information. An input device 540 may be coupled to the
bus 505 for communicating information and/or command selections to the processor 510.
In various implementations, input device 540 may be a keyboard, a keypad, a
touch-screen and stylus, a voice-activated system, or other input device, or combinations
of such devices. Another type of user input device that may be included is a cursor
control device 545, such as a mouse, a trackball, or cursor direction keys for 
communicating direction information and command selections to processor 510 and for 
controlling cursor movement on display device 530. 

[0035] A communication device 550 may also be coupled to the bus 505. 
Depending upon the particular implementation, the communication device 550 may 
include a transceiver, a wireless modem, a network interface card, or other interface 
device. The computer 500 may be linked to a network or to other devices using the 
communication device 550, which may include links to the Internet, a local area network,
or another environment. 

[0036] In the description above, for the purposes of explanation, numerous 
specific details are set forth in order to provide a thorough understanding of the present 
invention. It will be apparent, however, to one skilled in the art that the present invention 
may be practiced without some of these specific details. In other instances, well-known 
structures and devices are shown in block diagram form. 

[0037] The present invention may include various processes. The processes of
the present invention may be performed by hardware components or may be embodied in
machine-executable instructions, which may be used to cause a general-purpose or
special-purpose processor or logic circuits programmed with the instructions to perform
the processes. Alternatively, the processes may be performed by a combination of
hardware and software.

[0038] Portions of the present invention may be provided as a computer 
program product, which may include a machine-readable medium having stored thereon 
instructions, which may be used to program a computer (or other electronic devices) to
perform a process according to the present invention. The machine-readable medium
may include, but is not limited to, floppy diskettes, optical disks, CD-ROMs, and
magneto-optical disks, ROMs, RAMs, EPROMs, EEPROMs, magnet or optical cards,
flash memory, or other type of media / machine-readable medium suitable for storing 
electronic instructions. Moreover, the present invention may also be downloaded as a 
computer program product, wherein the program may be transferred from a remote 
computer to a requesting computer by way of data signals embodied in a carrier wave or 
other propagation medium via a communication link (e.g., a modem or network 
connection). 

[0039] Many of the methods are described in their most basic form, but 
processes can be added to or deleted from any of the methods and information can be 
added or subtracted from any of the described messages without departing from the basic 
scope of the present invention. It will be apparent to those skilled in the art that many 
further modifications and adaptations can be made. The particular embodiments are not 
provided to limit the invention but to illustrate it. The scope of the present invention is 
not to be determined by the specific examples provided above but only by the claims 
below. 

[0040] It should also be appreciated that reference throughout this specification 
to "one embodiment" or "an embodiment" means that a particular feature may be 
included in the practice of the invention. Similarly, it should be appreciated that in the 
foregoing description of exemplary embodiments of the invention, various features of the 
invention are sometimes grouped together in a single embodiment, figure, or description 
thereof for the purpose of streamlining the disclosure and aiding in the understanding of
one or more of the various inventive aspects. This method of disclosure, however, is not
to be interpreted as reflecting an intention that the claimed invention requires more
features than are expressly recited in each claim. Rather, as the following claims reflect, 
inventive aspects lie in less than all features of a single foregoing disclosed embodiment. 
Thus, the claims are hereby expressly incorporated into this description, with each claim 
standing on its own as a separate embodiment of this invention. 